Fully Character-Level Neural Machine Translation without Explicit Segmentation

نویسندگان

  • Jason Lee
  • Kyunghyun Cho
  • Thomas Hofmann
چکیده

Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subwordlevel encoder on WMT’15 DE-EN and CSEN, and gives comparable performance on FIEN and RU-EN. We then demonstrate that it is possible to share a single characterlevel encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the character-level encoder significantly outperforms the subword-level encoder on all the language pairs. We also observe that the quality of the multilingual character-level translation even surpasses the models trained and tuned on one language pair, namely on CSEN, FI-EN and RU-EN.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Character-level Decoder without Explicit Segmentation for Neural Machine Translation

The existing machine translation systems, whether phrase-based or neural, have relied almost exclusively on word-level modelling with explicit segmentation. In this paper, we ask a fundamental question: can neural machine translation generate a character sequence without any explicit segmentation? To answer this question, we evaluate an attention-based encoder– decoder with a subword-level enco...

متن کامل

NYU-MILA Neural Machine Translation Systems for WMT'16

We describe the neural machine translation system of New York University (NYU) and University of Montreal (MILA) for the translation tasks of WMT’16. The main goal of NYU-MILA submission to WMT’16 is to evaluate a new character-level decoding approach in neural machine translation on various language pairs. The proposed neural machine translation system is an attention-based encoder–decoder wit...

متن کامل

Towards Neural Machine Translation with Latent Tree Attention

Building models that take advantage of the hierarchical structure of language without a priori annotation is a longstanding goal in natural language processing. We introduce such a model for the task of machine translation, pairing a recurrent neural network grammar encoder with a novel attentional RNNG decoder and applying policy gradient reinforcement learning to induce unsupervised tree stru...

متن کامل

A Character-Aware Encoder for Neural Machine Translation

This article proposes a novel character-aware neural machine translation (NMT) model that views the input sequences as sequences of characters rather than words. On the use of row convolution (Amodei et al., 2015), the encoder of the proposed model composes word-level information from the input sequences of characters automatically. Since our model doesn’t rely on the boundaries between each wo...

متن کامل

Neural Sequence-to-sequence Learning of Internal Word Structure

Learning internal word structure has recently been recognized as an important step in various multilingual processing tasks and in theoretical language comparison. In this paper, we present a neural encoder-decoder model for learning canonical morphological segmentation. Our model combines character-level sequence-to-sequence transformation with a language model over canonical segments. We obta...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • TACL

دوره 5  شماره 

صفحات  -

تاریخ انتشار 2017